Out[2]:
| | City | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | AQI_Bucket |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Ahmedabad | 2015-01-01 | NaN | NaN | 0.92 | 18.22 | 17.15 | NaN | 0.92 | 27.64 | 133.36 | 0.00 | 0.02 | 0.00 | NaN | NaN |
| 1 | Ahmedabad | 2015-01-02 | NaN | NaN | 0.97 | 15.69 | 16.46 | NaN | 0.97 | 24.55 | 34.06 | 3.68 | 5.50 | 3.77 | NaN | NaN |
| 2 | Ahmedabad | 2015-01-03 | NaN | NaN | 17.40 | 19.30 | 29.70 | NaN | 17.40 | 29.07 | 30.70 | 6.80 | 16.40 | 2.25 | NaN | NaN |
| 3 | Ahmedabad | 2015-01-04 | NaN | NaN | 1.70 | 18.48 | 17.97 | NaN | 1.70 | 18.59 | 36.08 | 4.43 | 10.14 | 1.00 | NaN | NaN |
| 4 | Ahmedabad | 2015-01-05 | NaN | NaN | 22.10 | 21.42 | 37.76 | NaN | 22.10 | 39.33 | 39.31 | 7.01 | 18.89 | 2.78 | NaN | NaN |
Out[3]:
City 0
Date 0
PM2.5 4598
PM10 11140
NO 3582
NO2 3585
NOx 4185
NH3 10328
CO 2059
SO2 3854
O3 4022
Benzene 5623
Toluene 8041
Xylene 18109
AQI 4681
AQI_Bucket 4681
dtype: int64
Out[4]:
(29531, 16)
Out[8]:
City 0
PM2.5 0
PM10 0
NO 0
NO2 0
NOx 0
NH3 0
CO 0
SO2 0
O3 0
Benzene 0
Toluene 0
Xylene 0
AQI 0
AQI_Bucket 4681
Year 0
Month 0
Day 0
dtype: int64
C:\Users\Vinit Solanki\AppData\Local\Temp\ipykernel_14364\1682559575.py:1: FutureWarning: Series.fillna with 'method' is deprecated and will raise in a future version. Use obj.ffill() or obj.bfill() instead.
  df['AQI'].fillna(method='ffill', inplace=True)
C:\Users\Vinit Solanki\AppData\Local\Temp\ipykernel_14364\1682559575.py:2: FutureWarning: Series.fillna with 'method' is deprecated and will raise in a future version. Use obj.ffill() or obj.bfill() instead.
  df['AQI_Bucket'].fillna(method='ffill', inplace=True)
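The warning above recommends `ffill()` in place of `fillna(method='ffill')`. A minimal sketch of that replacement follows, using a toy frame as a stand-in for the notebook's `df` (the values are illustrative, not taken from the dataset). It also shows that forward fill cannot fill a leading gap, which may be why the first rows of `AQI_Bucket` are still `NaN` in `Out[11]`.

```python
import pandas as pd

# Toy frame standing in for the notebook's df (illustrative values only).
df = pd.DataFrame({
    "AQI": [None, 118.0, None, 70.0],
    "AQI_Bucket": [None, "Moderate", None, "Satisfactory"],
})

# Assign the result back instead of calling fillna(method="ffill", inplace=True).
df["AQI"] = df["AQI"].ffill()
df["AQI_Bucket"] = df["AQI_Bucket"].ffill()

# Forward fill has no earlier value to copy into row 0, so one gap remains per column.
remaining = df.isna().sum()
print(remaining)
```

If leading gaps should also be filled, chaining `.bfill()` after `.ffill()` is one option, at the cost of copying later values backwards in time.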
Out[11]:
| | City | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | AQI_Bucket | Year | Month | Day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Ahmedabad | 48.57 | 95.68 | 0.92 | 18.22 | 17.15 | 15.85 | 0.92 | 25.39 | 75.715 | 0.00 | 0.02 | 0.98 | 118.0 | NaN | 2015 | 1 | 1 |
| 1 | Ahmedabad | 48.57 | 95.68 | 0.97 | 15.69 | 16.46 | 15.85 | 0.97 | 24.55 | 34.060 | 3.68 | 5.50 | 0.98 | 118.0 | NaN | 2015 | 1 | 2 |
| 2 | Ahmedabad | 48.57 | 95.68 | 17.40 | 19.30 | 29.70 | 15.85 | 2.64 | 25.39 | 30.700 | 5.69 | 13.13 | 0.98 | 118.0 | NaN | 2015 | 1 | 3 |
| 3 | Ahmedabad | 48.57 | 95.68 | 1.70 | 18.48 | 17.97 | 15.85 | 1.70 | 18.59 | 36.080 | 4.43 | 10.14 | 0.98 | 118.0 | NaN | 2015 | 1 | 4 |
| 4 | Ahmedabad | 48.57 | 95.68 | 22.10 | 21.42 | 37.76 | 15.85 | 2.64 | 25.39 | 39.310 | 5.69 | 13.13 | 0.98 | 118.0 | NaN | 2015 | 1 | 5 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 29526 | Visakhapatnam | 15.02 | 50.94 | 7.68 | 25.06 | 19.54 | 12.47 | 0.47 | 8.55 | 23.300 | 2.24 | 12.07 | 0.98 | 41.0 | Good | 2020 | 6 | 27 |
| 29527 | Visakhapatnam | 24.38 | 74.09 | 3.42 | 26.06 | 16.53 | 11.99 | 0.52 | 12.72 | 30.140 | 0.74 | 2.21 | 0.98 | 70.0 | Satisfactory | 2020 | 6 | 28 |
| 29528 | Visakhapatnam | 22.91 | 65.73 | 3.45 | 29.53 | 18.33 | 10.71 | 0.48 | 8.42 | 30.960 | 0.01 | 0.01 | 0.98 | 68.0 | Satisfactory | 2020 | 6 | 29 |
| 29529 | Visakhapatnam | 16.64 | 49.97 | 4.05 | 29.26 | 18.80 | 10.03 | 0.52 | 9.84 | 28.300 | 0.00 | 0.00 | 0.98 | 54.0 | Satisfactory | 2020 | 6 | 30 |
| 29530 | Visakhapatnam | 15.00 | 66.00 | 0.40 | 26.85 | 14.05 | 5.20 | 0.59 | 2.10 | 17.050 | 1.07 | 2.97 | 0.98 | 50.0 | Good | 2020 | 7 | 1 |
29531 rows × 18 columns
Preprocessing complete. Cleaned dataset saved.
Model Training, Testing and Evaluation
Out[24]:
City 0
PM2.5 0
PM10 0
NO 0
NO2 0
NOx 0
NH3 0
CO 0
SO2 0
O3 0
Benzene 0
Toluene 0
Xylene 0
AQI 0
AQI_Bucket 0
Year 0
Month 0
Day 0
dtype: int64
Out[27]:
LinearRegression()
Model Coefficients: [ 0.66259842 0.0626614 0.02131187 0.0307719 0.02781581 -0.02218568 0.22053025 0.07155377 0.02452041 -0.02309724 0.05487827 0. ]
Intercept: -0.0007202162324249466
| | Actual AQI | Predicted AQI |
|---|---|---|
| 22593 | -0.327402 | -0.526979 |
| 22459 | -0.327402 | -0.526979 |
| 24213 | -0.097410 | -0.069059 |
| 25301 | 0.544147 | 0.720180 |
| 20886 | 2.063305 | 1.556177 |
Mean Absolute Error (MAE): 0.3165791689386951
Mean Squared Error (MSE): 0.2059085100144702
Root Mean Squared Error (RMSE): 0.4537714292619911
R-Squared (R²): 0.7912012234170357
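For reference, the four metrics reported above can be computed directly from their definitions. The sketch below uses NumPy and a handful of made-up values (not the notebook's test set); `sklearn.metrics` offers equivalent functions (`mean_absolute_error`, `mean_squared_error`, `r2_score`), which the notebook may well have used instead.

```python
import numpy as np

# Illustrative values only, not the notebook's actual test split.
y_true = np.array([-0.33, -0.10, 0.54, 2.06])
y_pred = np.array([-0.53, -0.07, 0.72, 1.56])

errors = y_true - y_pred

mae = np.mean(np.abs(errors))        # mean absolute error
mse = np.mean(errors ** 2)           # mean squared error
rmse = np.sqrt(mse)                  # root of the MSE, same units as the target

ss_res = np.sum(errors ** 2)                     # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
r2 = 1.0 - ss_res / ss_tot                       # coefficient of determination

print(f"MAE: {mae:.4f} MSE: {mse:.4f} RMSE: {rmse:.4f} R²: {r2:.4f}")
```

The reported numbers are consistent with these definitions: the square root of 0.2059 is about 0.4538. Since the actual AQI values shown include negatives, the target appears to have been scaled, so these errors are presumably in scaled units rather than raw AQI points.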
| | Date | City | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | AQI_Bucket | Year | Month | Day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2015-01-04 | 10.320000 | 0.087282 | 0.203731 | -0.231066 | -0.342167 | -0.222933 | -0.112833 | 0.676313 | 0.462941 | -0.417322 | 0.062761 | 0.300881 | 0.0 | -0.028413 | 2.960000 | 2015.0 | 1.000000 | 2.560000 |
| 1 | 2015-01-11 | 10.714286 | 0.098252 | 0.204767 | 0.222317 | 0.187937 | 0.021110 | 0.142625 | 0.638029 | 0.446622 | -0.385076 | 0.329840 | 0.345271 | 0.0 | 0.014127 | 2.857143 | 2015.0 | 1.000000 | 8.000000 |
| 2 | 2015-01-18 | 10.714286 | 0.098252 | 0.204767 | 0.237162 | 0.042651 | 0.125678 | 0.063202 | 0.747507 | 0.321056 | -0.275442 | 0.404165 | 0.476483 | 0.0 | 0.014127 | 2.857143 | 2015.0 | 1.000000 | 15.000000 |
| 3 | 2015-01-25 | 10.714286 | 0.095815 | 0.191041 | -0.064450 | -0.263890 | -0.299929 | -0.118319 | 0.967285 | 0.131151 | -0.228494 | 0.250962 | 0.020355 | 0.0 | -0.001065 | 2.795918 | 2015.0 | 1.000000 | 22.000000 |
| 4 | 2015-02-01 | 10.714286 | 0.225171 | 0.189906 | 0.054169 | -0.111199 | -0.081368 | -0.094327 | 0.730538 | -0.052651 | 0.031742 | 0.187905 | -0.052356 | 0.0 | 0.172108 | 2.551020 | 2015.0 | 1.142857 | 24.571429 |

| | Year | Month | City | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | AQI_Bucket | Day | Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2015 | 1 | 10.668224 | 0.114191 | 0.198100 | 0.062805 | -0.083797 | -0.090199 | -0.008588 | 0.770963 | 0.238944 | -0.249612 | 0.264279 | 0.208517 | 0.0 | 0.030681 | 2.794393 | 16.196262 | 2015-01-16 04:42:37.009345792 |
| 1 | 2015 | 2 | 10.714286 | 0.288594 | 0.183243 | -0.052974 | -0.012055 | 0.004397 | 0.293168 | 0.671077 | -0.035333 | 0.052416 | 0.183153 | 0.187007 | 0.0 | 0.276605 | 2.413265 | 14.500000 | 2015-02-14 12:00:00.000000000 |
| 2 | 2015 | 3 | 10.714286 | 0.206257 | 0.062114 | -0.124321 | -0.106982 | 0.008914 | 0.037991 | 0.918452 | -0.013726 | 0.257742 | 0.343649 | 0.457593 | 0.0 | 0.250786 | 2.211982 | 16.000000 | 2015-03-16 00:00:00.000000000 |
| 3 | 2015 | 4 | 10.714286 | 0.178300 | 0.135780 | -0.149716 | -0.146632 | -0.072492 | 0.089075 | 0.959977 | 0.201922 | 0.225972 | 0.042688 | -0.053902 | 0.0 | 0.296833 | 2.085714 | 15.500000 | 2015-04-15 12:00:00.000000000 |
| 4 | 2015 | 5 | 10.714286 | 0.093103 | 0.204767 | -0.185286 | -0.084963 | -0.086396 | 0.218881 | 1.057755 | 0.228192 | 0.229598 | 0.269988 | -0.152285 | 0.0 | 0.323833 | 2.193548 | 16.000000 | 2015-05-16 00:00:00.000000000 |

| | Year | City | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | AQI_Bucket | Month | Day | Date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2015 | 11.512674 | 0.138603 | 0.111125 | -0.053633 | -0.182989 | 0.060780 | 0.235161 | 0.710767 | -0.151728 | -0.084679 | -0.018131 | -0.047299 | 0.0 | 0.252075 | 2.433417 | 6.788290 | 15.761871 | 2015-07-10 00:34:26.690467584 |
| 1 | 2016 | 12.687752 | 0.262115 | 0.070190 | -0.039878 | 0.086431 | -0.108315 | 0.197141 | 0.064313 | -0.260322 | 0.078213 | -0.047449 | -0.085234 | 0.0 | 0.215384 | 2.541691 | 6.671363 | 15.761070 | 2016-07-06 07:32:32.156411648 |
| 2 | 2017 | 13.192152 | 0.175568 | 0.063968 | -0.085808 | 0.081692 | -0.300019 | 0.062556 | -0.448943 | -0.210662 | -0.039659 | -0.253313 | -0.314022 | 0.0 | 0.059469 | 2.625507 | 7.000213 | 15.874174 | 2017-07-16 13:38:07.140115200 |
| 3 | 2018 | 13.261938 | 0.079767 | 0.155770 | 0.108936 | 0.227406 | 0.224387 | -0.026822 | 0.093519 | 0.174511 | 0.042125 | 0.072524 | 0.127385 | 0.0 | 0.091118 | 2.320507 | 6.592490 | 15.738062 | 2018-07-04 00:50:04.172461824 |
| 4 | 2019 | 13.127182 | -0.099043 | -0.038186 | 0.059895 | -0.015154 | 0.108187 | -0.139493 | 0.052647 | 0.163861 | -0.047343 | 0.091275 | 0.164949 | 0.0 | -0.053913 | 2.152699 | 6.821246 | 15.779345 | 2019-07-11 01:11:10.104754176 |

Weekly, Monthly, and Yearly AQI data saved successfully!
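The notebook's aggregation code is not shown in this output, so the following is only a sketch of one way to produce weekly, monthly, and yearly means with pandas. The toy frame and its linearly increasing `AQI` values are assumptions for illustration. Weekly bins from `resample("W")` are labelled by the Sunday that ends each week, which matches the 2015-01-04 label in the weekly output above.

```python
import numpy as np
import pandas as pd

# Toy daily series (illustrative values), 2015-01-01 through 2015-03-01.
dates = pd.date_range("2015-01-01", periods=60, freq="D")
df = pd.DataFrame({"Date": dates, "AQI": np.linspace(50.0, 109.0, 60)})

# Weekly means: bins end on Sunday and are labelled with that date.
weekly = df.set_index("Date")[["AQI"]].resample("W").mean().reset_index()

# Monthly and yearly means via calendar columns, as in the notebook's Year/Month fields.
df["Year"] = df["Date"].dt.year
df["Month"] = df["Date"].dt.month
monthly = df.groupby(["Year", "Month"], as_index=False)["AQI"].mean()
yearly = df.groupby("Year", as_index=False)["AQI"].mean()

print(weekly.head())
print(monthly)
print(yearly)
```

Averaging every column, as the outputs above appear to do, also averages encoded categorical fields such as `City` and `AQI_Bucket`; selecting only the pollutant and AQI columns before aggregating avoids values that are hard to interpret.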
Out[38]:
0 0.000000
1 -0.327402
2 -0.327402
3 -0.327402
4 -0.327402
Name: ARIMA_Predicted_AQI, dtype: float64
MAE: 0.2340
MSE: 0.1180
RMSE: 0.3435
R² Score: 0.8820
Best Ridge Alpha: 10
Best Lasso Alpha: 0.01
Model: Ridge, MAE: 0.2217, MSE: 0.1163, RMSE: 0.3411, R² Score: 0.8820
Model: Lasso, MAE: 0.2210, MSE: 0.1164, RMSE: 0.3412, R² Score: 0.8819
Model: RandomForest, MAE: 0.1616, MSE: 0.0838, RMSE: 0.2896, R² Score: 0.9150
Model: GradientBoosting, MAE: 0.1764, MSE: 0.0877, RMSE: 0.2961, R² Score: 0.9111
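How the best alphas above were selected is not visible in this output. One common approach is a cross-validated grid search, sketched below with scikit-learn's `GridSearchCV`. The synthetic data, the alpha grid, and the 5-fold setting are all assumptions for illustration, so the selected values will not necessarily match the notebook's.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic regression data standing in for the scaled pollutant features.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = X @ np.array([0.7, 0.1, 0.0, 0.2, -0.05]) + rng.normal(scale=0.3, size=200)

# Hypothetical search grid; the notebook's actual grid is unknown.
param_grid = {"alpha": [0.01, 0.1, 1, 10, 100]}

best = {}
for name, estimator in [("Ridge", Ridge()), ("Lasso", Lasso(max_iter=10_000))]:
    # 5-fold cross-validation, scored by negative MSE (higher is better).
    search = GridSearchCV(estimator, param_grid, cv=5,
                          scoring="neg_mean_squared_error")
    search.fit(X, y)
    best[name] = search.best_params_["alpha"]
    print(f"Best {name} Alpha: {best[name]}")
```

In the results above, Ridge and Lasso score almost identically (R² of about 0.882), while the two tree ensembles do somewhat better (about 0.91), which suggests the regularization strength matters less here than the model family.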
Out[42]:
RandomForestRegressor(random_state=42)
ERROR: Could not find a version that satisfies the requirement pickle (from versions: none)
ERROR: No matching distribution found for pickle
[notice] A new release of pip is available: 24.0 -> 25.0
[notice] To update, run: python.exe -m pip install --upgrade pip
Model saved successfully!
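The pip error above is expected: `pickle` is part of Python's standard library, not a PyPI package, so it needs no installation and `import pickle` works on its own. A minimal sketch of saving and reloading a model follows; the synthetic data, the small `n_estimators`, and the file name `aqi_model.pkl` are hypothetical choices for illustration.

```python
import os
import pickle  # standard library: no pip install required
import tempfile

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Small synthetic problem standing in for the AQI training data.
rng = np.random.default_rng(42)
X = rng.normal(size=(50, 3))
y = X[:, 0] * 2.0 + rng.normal(scale=0.1, size=50)

model = RandomForestRegressor(n_estimators=10, random_state=42).fit(X, y)

# Hypothetical file name, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "aqi_model.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)

# Reload and confirm the restored model gives the same predictions.
with open(path, "rb") as f:
    restored = pickle.load(f)

same = np.allclose(model.predict(X), restored.predict(X))
print("Round trip OK:", same)
```

Pickled scikit-learn models are generally only safe to load with the same library version that wrote them, and pickle files from untrusted sources should never be loaded.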
Requirement already satisfied: plotly in c:\users\vinit solanki\onedrive\documents\vesit files\aqi_prediction\myenv\lib\site-packages (6.0.0)
Requirement already satisfied: narwhals>=1.15.1 in c:\users\vinit solanki\onedrive\documents\vesit files\aqi_prediction\myenv\lib\site-packages (from plotly) (1.25.2)
Requirement already satisfied: packaging in c:\users\vinit solanki\onedrive\documents\vesit files\aqi_prediction\myenv\lib\site-packages (from plotly) (24.2)